A RENEWAL DECISION PROBLEM


by 

C.  Derman,  G.  J.  Lieberman  and  S.  Ross 
0.  Statement  of  Problem 

A system must operate for t units of time. A certain component is
essential for its operation and must be replaced, when it fails, with a new
component. The class of spare components is grouped into n categories with
components of the ith category costing a positive amount $C_i$ and functioning
for an exponential length of time with rate $\lambda_i$. The main problem of interest
is, for a given t, to assign the initial component and subsequent replacements
from among the n categories of spare components so as to minimize the expected
cost of providing an operative component for t units of time.

In  Section  1 we  show  that  when  there  are  an  infinite  number  of  spares 
of  each  category,  the  optimal  policy  has  a simple  structure.  Namely,  the  time 
axis  can  be  divided  up  into  n intervals,  some  of  which  may  be  vacuous,  such 
that  when  a replacement  decision  has  to  be  made  it  is  optimal  to  select  a spare 
from the category having the ith largest value of $\lambda C$ whenever the remaining
time falls into the ith closest interval to the origin. In Section 2 we consider
the situation where n = 2 and there is only a single spare of one
category  and  an  infinite  number  of  the  other.  In  Section  3 we  consider  the 
case  where  there  is  only  a finite  number  of  spares  for  certain  of  the  categories 
under  the  assumption  that  a rebate  is  allowed  for  the  component  in  use  at  the 
end  of  the  problem.  In  Section  4 we  allude  to  a generalization  of  the  model  in 
Section  1 allowing  for  discounting  or  for  the  possibility  that  the  system  may 
randomly  terminate  before  the  t units  of  time  expire.  An  optimal  policy  has 
the  same  simple  structure  as  in  Section  1. 


1. Infinite Surplus in All Categories

In this section we suppose that our surplus of spare parts contains an
infinite number of each category, and we number them so that
$\lambda_1 C_1 > \lambda_2 C_2 > \dots > \lambda_n C_n$. In addition, we suppose that there is no i and j,
$i \ne j$, such that $C_j \ge C_i$ and $\lambda_j \ge \lambda_i$; for if such is the case it can be
shown (see Proposition 2) that category j need never be used.

Letting V(t) denote the infimal expected additional cost incurred when
there are t time units to go and a failure has just occurred, then V(t) satisfies
the optimality equation

$$V(t) = \min_{i=1,\dots,n} \left\{ C_i + \int_0^t V(t-x)\,\lambda_i e^{-\lambda_i x}\,dx \right\}, \quad t > 0 \tag{1}$$

and V(0) = 0.

In addition, the policy which chooses, when t time units are remaining, a spare
from a category whose number minimizes the right side of the optimality equation
is an optimal policy. (This is a standard result in dynamic programming when
all costs are assumed non-negative; see [3], [4].)
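
The optimality equation (1) can also be explored numerically. The following is a minimal sketch, not part of the original analysis: it assumes a uniform grid, treats V as constant over each grid step (a first-order approximation), and uses V(0+) = min C_i as the starting value. The function name `solve_optimality_equation` and its parameters are illustrative choices.

```python
import math

def solve_optimality_equation(costs, rates, horizon, steps=20000):
    """Approximate V from equation (1) on a uniform grid over (0, horizon].

    Returns (grid, values, choices), where choices[k] is the 0-based index
    of the category attaining the minimum in (1) at grid[k].
    """
    h = horizon / steps
    n = len(costs)
    decay = [math.exp(-lam * h) for lam in rates]
    # integrals[i] approximates the integral over (0, t) of
    # V(y) * lam_i * exp(-lam_i * (t - y)) dy, updated recursively in t.
    integrals = [0.0] * n
    # As t -> 0+ the integral in (1) vanishes, so V(0+) = min_i C_i.
    v = min(costs)
    grid, values, choices = [], [], []
    for k in range(1, steps + 1):
        for i in range(n):
            # Old mass decays by exp(-lam_i h); on the new slice of width h
            # V is approximated by its value at the left endpoint.
            integrals[i] = decay[i] * integrals[i] + v * (1.0 - decay[i])
        candidates = [costs[i] + integrals[i] for i in range(n)]
        best = min(range(n), key=candidates.__getitem__)
        v = candidates[best]
        grid.append(k * h)
        values.append(v)
        choices.append(best)
    return grid, values, choices
```

For instance, calling `solve_optimality_equation([1.0, 1.5], [2.0, 1.0], 2.0)` gives an approximate value function together with the category chosen at each grid point, so the point at which the minimizing category changes can be read off from `choices`.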


Proposition  1: 

V(t)  is  an  increasing,  continuous  function  of  t for  t > 0 . 


Proof: The increasing part follows from the definition of V(t) since all
costs are assumed non-negative. To prove continuity suppose that it is optimal
to select a spare from category i when there are t units of time remaining.
Then by selecting this same category at time $t + \varepsilon$ we see, from the lack of
memory of the exponential, and the monotonicity of V that

$$V(t) \le V(t + \varepsilon) \le e^{-\lambda_i \varepsilon}\,V(t) + (1 - e^{-\lambda_i \varepsilon})\,(C_i + V(t + \varepsilon)).$$

Letting $\varepsilon \to 0$, the right side tends to V(t), and the result follows. Q.E.D.


Theorem 1:

V(t) is a piecewise linear concave function of t having at most n pieces.

Proof: Consider any value t > 0. Suppose the assignment of category i
when t units of time remain is uniquely optimal. Then by the continuity of
V and the optimality equation (1) there is an interval $(t, t + \varepsilon)$, $\varepsilon > 0$,
such that i is uniquely optimal at every point in $(t, t + \varepsilon)$. Suppose
several categories are optimal at t. Then the expressions within the brackets
of (1) corresponding to each of the optimal categories are all equal to V(t).

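If i is optimal the derivative of the expression with respect to t
corresponding to category i is

$$
\begin{aligned}
\frac{d}{dt}\left\{ C_i + \int_0^t V(t-x)\,\lambda_i e^{-\lambda_i x}\,dx \right\}
&= \frac{d}{dt}\left\{ C_i + \int_0^t V(y)\,\lambda_i e^{-\lambda_i (t-y)}\,dy \right\} \\
&= \lambda_i V(t) - \lambda_i \int_0^t V(y)\,\lambda_i e^{-\lambda_i (t-y)}\,dy \\
&= \lambda_i V(t) - \lambda_i \int_0^t V(t-x)\,\lambda_i e^{-\lambda_i x}\,dx \\
&= \lambda_i V(t) - \lambda_i \,(V(t) - C_i) \\
&= \lambda_i C_i ;
\end{aligned}
\tag{2}
$$
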

the derivative existing since V(t) is continuous. It follows that among those
categories that are optimal at t, the category j with the smallest $\lambda_j C_j$
will be uniquely optimal over some interval $(t, t + \varepsilon')$, $\varepsilon' > 0$. Since at
each change (as t increases) of optimal category a category with a smaller $\lambda C$
becomes optimal, there can be at most n values of t where a change in optimal
category takes place. Since the derivative $V'(t)$ is constant within the intervals
where one category is optimal, V(t) is linear within the interval; it is concave
because the derivatives are non-increasing. Q.E.D.
Remark: Having established that an optimal policy employs the same category over
intervals,  the  linearity  and  concavity  of  V(t)  can  be  deduced,  as  well,  from 
the  memoryless  property  of  the  exponential  distribution  and  the  linearity  of 
the  renewal  function  of  the  Poisson  process. 

It follows from the proof of Theorem 1 that the optimal policy uses
category 1 spares when the time remaining is small, then switches to category
2 spares as the time increases, then category 3 spares as the time further
increases, etc., where, of course, the interval of use for some categories may be
empty. This suggests two possible algorithms for finding switching points.


Algorithm  1: 

For this algorithm let $V_i(t)$ denote the minimal expected cost function
and let $\pi_i$ denote the optimal policy, when only categories 1, 2, ..., i are
available. For instance

$$V_1(t) = C_1(1 + \lambda_1 t), \quad 0 < t < \infty,$$

and $\pi_1$ is the policy which always replaces with a spare from category 1.
From our previous structural results it follows that $\pi_i$ will use category
i whenever the time remaining is at least some finite critical value $t_{i-1}$.
Now at $t_{i-1}$ it follows, by continuity, that it is optimal either to use
category i and then proceed optimally, or to just use $\pi_{i-1}$. Hence, if
$t_{i-1} > 0$, then

$$V_{i-1}(t_{i-1}) = C_i + \int_0^{t_{i-1}} V_{i-1}(t_{i-1} - x)\,\lambda_i e^{-\lambda_i x}\,dx. \tag{3}$$

Furthermore, since it follows from the optimality equation that for small
values of t, $\pi_i$ chooses the category with minimal value of $C_k$, we obtain
that $t_{i-1} = 0$ if and only if $C_i = \min_{1 \le k \le i} C_k$. Hence, unless this is the
case, $t_{i-1}$ can be taken to be the smallest positive solution of (3).
And, in addition, we have

$$V_i(t) = V_{i-1}(t), \quad t \le t_{i-1},$$

$$V_i(t) = V_{i-1}(t_{i-1}) + \lambda_i C_i\,(t - t_{i-1}), \quad t > t_{i-1},$$

and $\pi_i$ uses category i whenever $t > t_{i-1}$ and follows $\pi_{i-1}$ when
$t \le t_{i-1}$.


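For example when $C_1 < C_2$, $\lambda_1 C_1 > \lambda_2 C_2$, this algorithm yields
that $t_1$ is chosen so that

$$C_1(1 + \lambda_1 t_1) = C_2 + C_1 \int_0^{t_1} [1 + \lambda_1 (t_1 - x)]\,\lambda_2 e^{-\lambda_2 x}\,dx.$$

Simplifying,

$$t_1 = \frac{1}{\lambda_2} \log \frac{C_1 \lambda_1 - C_1 \lambda_2}{C_1 \lambda_1 - C_2 \lambda_2}.$$

The expression for $V_2(t)$ can be written as

$$V_2(t) = C_1(1 + \lambda_1 t), \quad t \le t_1,$$

$$V_2(t) = C_1 + (\lambda_1 C_1 - \lambda_2 C_2)\,t_1 + \lambda_2 C_2\,t, \quad t > t_1.$$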
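
The two-category case can be checked directly. The sketch below evaluates the switching time and the two-piece cost function for n = 2; the function names and the numerical values in the usage note are illustrative assumptions, and the code presumes $C_1 < C_2$ and $\lambda_1 C_1 > \lambda_2 C_2$ as in the example.

```python
import math

def two_category_switch_time(c1, lam1, c2, lam2):
    """Switching time t_1 for n = 2, assuming c1 < c2 and lam1*c1 > lam2*c2."""
    if not (c1 < c2 and lam1 * c1 > lam2 * c2):
        raise ValueError("requires c1 < c2 and lam1*c1 > lam2*c2")
    # These two conditions imply lam1 > lam2, so the log argument exceeds 1.
    return math.log((c1 * lam1 - c1 * lam2) / (c1 * lam1 - c2 * lam2)) / lam2

def v2(t, c1, lam1, c2, lam2):
    """Two-piece linear cost function V_2(t) for the two-category problem."""
    t1 = two_category_switch_time(c1, lam1, c2, lam2)
    if t <= t1:
        return c1 * (1.0 + lam1 * t)
    return c1 + (lam1 * c1 - lam2 * c2) * t1 + lam2 * c2 * t
```

With the hypothetical values $C_1 = 1$, $\lambda_1 = 2$, $C_2 = 1.5$, $\lambda_2 = 1$, the switching time is $\log 2$, and the two branches of `v2` agree at that point, as continuity of V requires.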


Algorithm 2:

Let $C_{i_1} = \min\{C_1, \dots, C_n\}$. For some value $t_1 > 0$, category
$i_1$ is used whenever $0 < t \le t_1$. To find $t_1$, for every value of
i, $i > i_1$, determine $x_i$, the smallest value of x, $x \ge 0$, satisfying ...

$$V(t) = C_{i_1}(\lambda_{i_1} t + 1) = u_1(t) \ \text{(say)} \quad \text{for } t \ge 0 \ \dots$$

For some value $t_k$ ... for every value of i, $i > i_k$ ... $< t \le t_k$ ...

Let $t_k = \min_{i > i_k} \{x_i\} = x_{i_{k+1}}$. This process stops when $t_k = \infty$. Of course
if $i_k = n$, then $t_k = \infty$.

Both  algorithms  automatically  exclude  those  categories  which  should 
never  be  used.  However,  it  is  possible  to  eliminate  some  in  advance.  This 
is  indicated  in  the  following. 

Proposition 2.

If $C_j \ge C_i$, $\lambda_j > \lambda_i$ ($C_j > C_i$, $\lambda_j \ge \lambda_i$), then category j is never
used in an optimal policy.

Proof: Let t > 0 be arbitrary. Let $\pi_1$ be the policy that uses category
j at t and subsequently assigns categories optimally. Let $\pi_2$ be the
policy that uses category i and subsequently assigns categories optimally.
On comparing $\pi_1$ and $\pi_2$ we have

$$V_{\pi_1}(t) - V_{\pi_2}(t) = (C_j - C_i) + \left( \int_0^t V(t-x)\,\lambda_j e^{-\lambda_j x}\,dx - \int_0^t V(t-x)\,\lambda_i e^{-\lambda_i x}\,dx \right).$$

The first expression on the right is non-negative (positive) by assumption.
The second expression is positive (non-negative) since
$\int_0^\infty f(x)\,\lambda e^{-\lambda x}\,dx$ is increasing in $\lambda$ for every non-constant
non-increasing function f; V(t - x) is such a function in x, as seen by
letting V(t - x) = 0 for x > t. Thus $V_{\pi_1}(t) - V_{\pi_2}(t) > 0$. Since the
use of category j can always be improved upon by using category i its use
can never be optimal. Q.E.D.
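
Proposition 2 gives a simple pre-screening rule that can be applied before running either algorithm. The sketch below is one way to apply it; the function name and the representation of categories as `(cost, rate)` pairs are assumptions made for illustration.

```python
def prune_dominated(categories):
    """Return the indices of categories not excluded by Proposition 2.

    categories is a list of (cost, rate) pairs. Category j is dropped when
    some other category i has cost <= cost_j and rate <= rate_j with at
    least one of the two inequalities strict.
    """
    keep = []
    for j, (cj, lj) in enumerate(categories):
        dominated = any(
            i != j and cj >= ci and lj >= li and (cj > ci or lj > li)
            for i, (ci, li) in enumerate(categories)
        )
        if not dominated:
            keep.append(j)
    return keep
```

Two categories with identical cost and rate are both kept, since neither inequality is strict in that case.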

Remark: It is also intuitive that if, for some i and j, $\lambda_i C_i < \lambda_j C_j$
and $C_i < C_j$, then category j will never be used. However, while this is
evident from the formula for $t_1$ in the case of n = 2, and implies it is
true for n = 3, we have not been able to prove it in general.
2. Finite ...

... infinite supply of spares of one category and only one of the other category.

Theorem 2:

If the set of spare components consists of one component of
category 1 and an infinite number of category 2, where $C_2 > C_1$ ...,
the optimal policy is to use a category 2 component when the time
remaining t is greater than $\bar{t}$ and use the category 1 component when
$t \le \bar{t}$ ...


Proof: Since once the decision to use the category 1 item is made there
are no further decisions, it follows that one may regard this as a
stopping rule problem where stopping means the use of the category 1 item.
The one-stage look ahead stopping policy stops at t whenever stopping
at t is better than continuing at t and then stopping at the next
opportunity. Now letting X denote the lifetime of the category 1 component
and V that of the first category 2 component used, then $W_1$, the expected
cost of using 1 at t, is ... while $W_2$, the expected cost of first using a
category 2 at t and then using the category 1 component, is given by ...

Hence

$$
\begin{aligned}
W_1 - W_2 &= C_1 P\{X > t\} + (C_1 + C_2)\,P\{X < t,\ X + V > t\} \\
&\quad - (C_1 + C_2)\,P\{V < t,\ X + V > t\} - C_2 P\{V > t\} \\
&= C_1 P\{V > t\} - C_2 P\{X > t\} \\
&= C_1 e^{-\lambda_2 t} - C_2 e^{-\lambda_1 t}.
\end{aligned}
$$

Therefore, the one-stage look ahead policy uses the category 1 component
at t whenever

$$C_1 e^{-\lambda_2 t} - C_2 e^{-\lambda_1 t} < 0$$

or, equivalently, whenever

$$t < \bar{t} = \frac{\log(C_2 / C_1)}{\lambda_1 - \lambda_2} \quad \text{if } \lambda_1 > \lambda_2,$$

or

$$\bar{t} = \infty \quad \text{if } \lambda_1 < \lambda_2.$$

Since these sets of time points at which the one-stage look ahead policy
stops can never be left once entered (without stopping), it follows (see
[1] or [2]) that it is an optimal policy. Q.E.D.
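
The look ahead comparison in the proof is easy to evaluate numerically. The following sketch computes $W_1 - W_2$ for exponential lifetimes and the resulting threshold; the function names are illustrative, and returning infinity when $\lambda_1 \le \lambda_2$ reflects our reading that the inequality then holds for every $t \ge 0$ (given $C_2 > C_1$).

```python
import math

def look_ahead_difference(t, c1, lam1, c2, lam2):
    """W_1 - W_2 = C_1 P{V > t} - C_2 P{X > t} for exponential X and V."""
    return c1 * math.exp(-lam2 * t) - c2 * math.exp(-lam1 * t)

def single_spare_threshold(c1, lam1, c2, lam2):
    """Remaining time below which the single category 1 spare is used.

    Assumes c2 > c1, as in Theorem 2.
    """
    if c2 <= c1:
        raise ValueError("requires c2 > c1")
    if lam1 <= lam2:
        # The difference is negative for all t >= 0 in this case.
        return math.inf
    return math.log(c2 / c1) / (lam1 - lam2)
```

A negative value of `look_ahead_difference` means that using the category 1 component immediately is the cheaper of the two one-stage alternatives.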

Remark: The above proof does not need the assumption of exponential
distributions for X and V. The same form of policy is optimal if there is
a $\bar{t}$ such that

$$C_1 P\{V > t\} - C_2 P\{X > t\} \le 0 \ \text{for } t \le \bar{t} \quad (\ge 0 \ \text{for } t > \bar{t}).$$

If the failure rate function of X is always greater than the failure rate
function of V then such a $\bar{t}$ (possibly $\infty$) will exist since then
$P(V > t)/P(X > t)$ is a non-decreasing function of t.


Theorem 3:

If the set of spare components is an infinite number of category 1
and one of category 2, where $C_2 > C_1$, $\lambda_2 < \lambda_1$, then the optimal
policy is to use the category 2 component when the time remaining t is
greater than x and use a category 1 component when $t \le x$ where

$$x = \frac{1}{\lambda_2} \log \frac{C_1 \lambda_1 - C_1 \lambda_2}{C_1 \lambda_1 - C_2 \lambda_2}.$$

Note: As we might expect x is the same as in the infinite supply model
when n = 2.


Proof: Using the stated policy the expected cost function is

$$u(t) = C_1(\lambda_1 t + 1) \quad \text{if } t \le x,$$

$$u(t) = C_2 + \int_0^t C_1 [\lambda_1 (t - y) + 1]\,\lambda_2 e^{-\lambda_2 y}\,dy \quad \text{if } t > x.$$

It is somewhat tedious but possible to verify that u(t) satisfies

$$u(t) = \min\left\{ C_1 + \int_0^t u(t - y)\,\lambda_1 e^{-\lambda_1 y}\,dy,\ \ C_2 + \int_0^t C_1 [\lambda_1 (t - y) + 1]\,\lambda_2 e^{-\lambda_2 y}\,dy \right\}.$$

That is, u(t) satisfies the optimality equations for this problem; it
thus follows from Proposition 1 of [3] that this policy is optimal. Q.E.D.
3. Finite Supply with Rebate Problem


Suppose  in  some  of  the  n categories  there  are  only  a finite  number  of 
components.  We  assume  that  the  components  are  numbered  j = 1,  2,  ...  . 
However,  in  contrast  to  the  previous  problems,  the  cost  of  the  last  component 
used  is  returned.  The  problem,  again,  is  to  determine  a policy  for  deciding 
which  available  component  to  use  when  a new  component  is  required,  so  as  to 
minimize  the  total  expected  cost. 

If  t units  of  time  remain  when  a particular  component  from  category  i 
is  put  into  use  and  L is  the  length  of  its  life,  then  the  expected  cost 
associated  with  the  use  of  this  component  is 


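$$
\begin{aligned}
E(\text{cost of component} \mid t) &= C_i P\{L < t\} \\
&= C_i \lambda_i \cdot \frac{1 - e^{-\lambda_i t}}{\lambda_i} \\
&= C_i \lambda_i\,E[\min(L, t)] \\
&= C_i \lambda_i\,E[\text{length of time this component is used} \mid t].
\end{aligned}
$$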


Thus, letting $\delta(j)$ denote the category to which component j belongs it
follows that the total expected cost under any policy $\pi$ is

$$E_\pi[\text{total cost}] = \sum_j \lambda_{\delta(j)} C_{\delta(j)}\,E_\pi[\tau_j]$$

where $\tau_j$ is the amount of time component j is used, and where the sum
is taken over all components.

Consider now a modified problem where the cost associated with component
j is $\lambda_{\delta(j)} C_{\delta(j)} \tau_j$ where once again $\tau_j$ is defined as the amount
of time that component j is in use, $j \ge 1$. The total expected cost with
respect to the modified problem where policy $\pi$ is used is precisely the
same as the total expected cost with respect to the original problem. Thus,
the policy that is optimal for the modified problem is optimal for the original
problem. However, since the $\tau_j$ always sum to t, it is clear that the policy
that uses the category associated with the minimal available $\lambda_i C_i$ is
optimal for the modified problem. Thus we have proved the following:

Theorem  4: 

The policy that minimizes the total expected cost when a rebate
is given for the last component used is that policy that always selects
among all the available categories that one having the smallest
$\lambda_i C_i$.
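
The identity behind Theorem 4 and the resulting selection rule can be illustrated as follows. This is a sketch under stated assumptions: the stock is represented as a dictionary mapping a category label to a hypothetical `(cost, rate, count)` triple, and the function names are ours.

```python
import math

def expected_cost_with_rebate(cost, rate, t):
    """C_i * P{L < t}: the cost is incurred only if the component fails before t."""
    return cost * (1.0 - math.exp(-rate * t))

def expected_time_in_use(rate, t):
    """E[min(L, t)] for an exponential lifetime L with the given rate."""
    return (1.0 - math.exp(-rate * t)) / rate

def choose_component(stock):
    """Pick the available category with the smallest rate * cost.

    stock maps a category label to (cost, rate, count); count may be
    math.inf for an unlimited supply.
    """
    available = {k: v for k, v in stock.items() if v[2] > 0}
    if not available:
        raise ValueError("no spare components available")
    return min(available, key=lambda k: available[k][0] * available[k][1])
```

The first two functions differ exactly by the factor $\lambda_i C_i$, which is why the choice made by `choose_component` does not depend on the remaining time.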

Remark 1: The same problem can be thought of in a different context except
that it leads to a maximization problem instead of a minimization one.
There are n jobs to be performed sequentially within a fixed time t;
the ith job takes an exponential amount of time with mean $1/\lambda_i$ and if
completed within the time span of the problem earns the decision maker an
amount $C_i$. Whenever a job is completed the decision maker must decide
which job to attempt. He wishes to maximize the total expected earnings.
The modified problem has the decision maker earn $\lambda_i C_i \tau_i$ if $\tau_i$ units
of time are spent on the ith job whether or not it is completed within
the time span of the problem. The decision maker wishes to maximize total
expected earnings. As before, the optimal policy for the two problems
is identical, namely, always choose the job with maximum $\lambda_i C_i$.
Remark 2: The conclusions for the problem in this section obviously hold
if each category contains an infinite number of components. Since the
optimal expected total cost must be less than the optimal expected cost for
the problem discussed in Section 1 and since always using category n in
that problem is not necessarily optimal we have the inequalities

$$\lambda_n C_n t \le V(t) \le \lambda_n C_n t + C_n.$$

Actually, better bounds can be obtained. If the rebate given for the
last component used is $C_j - C_1$ when the last component used is from
category j, it is still optimal to use category n for each replacement.
One then arrives at

$$\lambda_n C_n t + C_1 \le V(t).$$
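
These bounds are cheap to evaluate and give a quick sanity check on any computed value of V(t). The sketch below assumes the categories are listed in the Section 1 order (so the last entry is category n) and that $C_1$ is the smallest cost, as in the two-category example of Section 1; the function name is illustrative.

```python
def value_bounds(costs, rates, t):
    """Return (lower, upper) bounds on V(t) from Remark 2.

    lower = lam_n * C_n * t + C_1, upper = lam_n * C_n * t + C_n, with
    categories ordered so that the last entry has the smallest rate * cost.
    """
    c1, cn, lam_n = costs[0], costs[-1], rates[-1]
    base = lam_n * cn * t
    return base + c1, base + cn
```

For the hypothetical values $C = (1, 1.5)$, $\lambda = (2, 1)$ and t = 2, the bounds bracket the value given by the closed form for $V_2(t)$ in Section 1.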


Remark 3: Since the optimal policy is independent of the remaining time
it  follows  that  this  policy  is  optimal  when  the  time  horizon  is  a random 
variable  having  any  arbitrary  distribution. 
Suppose in Section 1 discounting is appropriate, i.e., for some
$\alpha > 0$

$$V(t) = \min_i \left\{ C_i + \int_0^t e^{-\alpha x}\,V(t - x)\,\lambda_i e^{-\lambda_i x}\,dx \right\}.$$

This can be interpreted in the usual economic sense or as a random
termination model. The latter interpretation arises when the system, in addition
to being terminated definitely t units in the future, may also be
terminated due to a randomly occurring accident; the time until such an accident
occurs has an exponential distribution with mean $1/\alpha$.

The methodology of Section 1 applies in this case. Proposition 1
holds for V(t). The derivative corresponding to (2) becomes
$(\lambda_i + \alpha) C_i - \alpha V(t)$. With $\lambda_i C_i$ replaced by $(\lambda_i + \alpha) C_i$ the same
type of statement concerning the structure of an optimal policy holds.
The segments will no longer be linear; however, V(t) will still
be concave (the segments being the appropriate solution to the linear
differential equation $V'(t) = (\lambda_i + \alpha) C_i - \alpha V(t)$).
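
Each segment of the discounted value function solves a first-order linear differential equation with a closed-form solution. The following sketch evaluates that solution for one segment; the function name and the parametrization by an initial point `(t0, v0)` are assumptions made for illustration.

```python
import math

def discounted_segment(t, t0, v0, cost, rate, alpha):
    """Solve V'(t) = (rate + alpha) * cost - alpha * V(t) with V(t0) = v0.

    The solution relaxes exponentially from v0 toward the constant
    (rate + alpha) * cost / alpha.
    """
    limit = (rate + alpha) * cost / alpha
    return limit + (v0 - limit) * math.exp(-alpha * (t - t0))
```

With a single category, `t0 = 0` and `v0 = cost`, the result approaches the undiscounted line $C(1 + \lambda t)$ as `alpha` tends to zero, which matches the form of $V_1(t)$ in Section 1.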

